Add ChaChaPoly AEAD-4 encryption with nonce persistence #1677
weebl2000 wants to merge 8 commits into meshcore-dev:dev
Add ChaCha20-Poly1305 AEAD decryption with 4-byte auth tag for peer messages and group channels, falling back to ECB for backward compatibility. Sending remains ECB-only in this phase.
- Per-message key derivation: HMAC-SHA256(secret, nonce||dest||src)
- Direction-dependent keys prevent bidirectional keystream reuse
- 12-byte IV from nonce + dest_hash + src_hash
- Advertise AEAD capability via feat1 bit 0 in adverts
- Track peer AEAD support in ContactInfo.flags
- Seed aead_nonce from HW RNG on contact creation and load
Send ChaChaPoly-encrypted messages to peers with CONTACT_FLAG_AEAD set, and try AEAD decode first for those peers (avoiding 1/65536 ECB false-positive). Legacy peers continue to use ECB in both directions.
- Add aead_nonce parameter to createDatagram/createPathReturn (default 0 = ECB)
- Add getPeerFlags/getPeerNextAeadNonce virtual methods for decode-order selection
- Add ContactInfo::nextAeadNonce() helper (returns nonce++ if AEAD, 0 otherwise)
- Update all BaseChatMesh send paths to pass nonce for AEAD-capable peers
- Adaptive decode order: AEAD-first for known AEAD peers, ECB-first for others
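The adaptive decode order from this commit can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's code: Scheme, DecodeOrder, decodeOrderFor, and decodeAdaptive are hypothetical names, the two decoders are passed in as callables, and the flag value assumes CONTACT_FLAG_AEAD is bit 1 of ContactInfo.flags, as the PR description states.

```cpp
#include <cstdint>
#include <initializer_list>

// Assumed from the PR description: ContactInfo.flags bit 1 marks an AEAD-capable peer.
constexpr uint8_t CONTACT_FLAG_AEAD = 1u << 1;

// Hypothetical names, for illustration only.
enum class Scheme : uint8_t { Ecb, Aead };

struct DecodeOrder {
  Scheme first;
  Scheme second;
};

// Known AEAD peers get AEAD first; unknown or legacy peers get ECB first.
inline DecodeOrder decodeOrderFor(uint8_t peer_flags) {
  if (peer_flags & CONTACT_FLAG_AEAD) return {Scheme::Aead, Scheme::Ecb};
  return {Scheme::Ecb, Scheme::Aead};
}

// Try both decoders in the chosen order and stop at the first success.
// try_aead / try_ecb stand in for the real decrypt-and-verify calls.
template <typename TryAead, typename TryEcb>
bool decodeAdaptive(uint8_t peer_flags, TryAead try_aead, TryEcb try_ecb, Scheme& used) {
  const DecodeOrder order = decodeOrderFor(peer_flags);
  for (Scheme s : {order.first, order.second}) {
    const bool ok = (s == Scheme::Aead) ? try_aead() : try_ecb();
    if (ok) {
      used = s;
      return true;
    }
  }
  return false;  // neither scheme authenticated the packet
}
```

Trying AEAD first for flagged peers means an AEAD packet is never handed to the ECB path unless the 4-byte tag check has already failed, which is where the 1/65536 false positive would otherwise arise.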
The header's route type bits (PH_ROUTE_MASK) are zero when createDatagram/createPathReturn encrypt with AEAD, but get changed to ROUTE_TYPE_FLOOD (1) or ROUTE_TYPE_DIRECT (2) by sendFlood/sendDirect afterwards. The receiver builds assoc from the received header (with route bits set), so the tag check always fails and every AEAD packet is silently dropped. Mask out route type bits in assoc data on all 5 encrypt/decrypt sites. Also track AEAD decode success to enable peer capability auto-detection.
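A minimal sketch of the masking described in this commit, assuming a two-bit route field (PH_ROUTE_MASK = 0x03 is an assumption; the FLOOD and DIRECT values come from the commit message). buildAssoc and the example header value are hypothetical and only show why masking makes the sender's and receiver's associated data agree.

```cpp
#include <array>
#include <cstdint>

// FLOOD = 1 and DIRECT = 2 are taken from the commit message.
// The 0x03 mask is an assumption made for this sketch.
constexpr uint8_t PH_ROUTE_MASK = 0x03;
constexpr uint8_t ROUTE_TYPE_FLOOD = 0x01;
constexpr uint8_t ROUTE_TYPE_DIRECT = 0x02;

// Hypothetical helper: associated data for a peer message is
// header (route bits cleared) || dest_hash || src_hash.
inline std::array<uint8_t, 3> buildAssoc(uint8_t header, uint8_t dest_hash, uint8_t src_hash) {
  return {static_cast<uint8_t>(header & ~PH_ROUTE_MASK), dest_hash, src_hash};
}
```

With the mask applied, a header that had zero route bits at encryption time and the same header after sendFlood or sendDirect has set them both produce identical associated data, so the tag check no longer depends on the routing mode.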
I do not understand how this prevents nonce re-use. After 65k messages from A->B the nonce looks like it will be reused. I do not understand why concatenation with src/dst would change this. The concatenation means you are partitioning the nonce value per (uni-directional) flow, in effect running different counters for A->B, B->A and C->A. Right?
What happens for devices without access to a good early boot entropy source? What if two different reboots generate the same nonce? What happens for A->B if:
What does this method improve over a plain incremental counter? Why not persist the nonce once every 100 messages, and on reboot increment by 200 (rounded down to nearest 100)? When the nonce wraps, regenerate the key.
Yeah, it doesn't stop nonce re-use. I think in the end we might need more bytes for nonces.
You do not; you can also change the key. Just negotiate a dedicated key for this. It is a lot easier to understand and make safe. It would require a round trip but would then only need to be done every 65k messages; you could then also share that key for both directions (i.e. A->B and B->A). Then when
Might be a good option. But the protocol will become a bit more complex and brittle. Then again, we can always fall back to ECB if nothing was negotiated.
jcjones
left a comment
Not a casual review, but I like the design, and the directionality of the KDF. Good doc comments, too.
Thanks for all the comments so far. I will look into them.
Just tested this branch with a Heltec v4 repeater and Heltec v4 companion client, and I can confirm communicating between them works using AEAD-4. It's a request for status from the repeater and the repeater response is understood correctly by the client.
AEAD-4 Packet Decode Verification
Wire Format
Sent Packet: REQ (23 bytes)
Raw:
Format confirmed AEAD-4: 17 bytes after hashes is not a multiple of 16, ruling out legacy ECB.
Received Packet: RESPONSE (70 bytes)
Raw:
Note: legacy ECB is structurally possible here (64 bytes is a multiple of 16), but context confirms AEAD-4.
Associated Data
Per the route-mask fix, assoc data masks out route type bits:
Observations
- Fix potential unsigned overflow in createDatagram size check by subtracting constants from MAX_PACKET_PAYLOAD instead of adding to data_len
- Add upper-bound validation on src_len and assoc_len in aeadEncrypt and aeadDecrypt
- Log peer name on AEAD nonce wraparound for debug builds
Prevent nonce reuse after reboots by persisting per-peer nonce counters to a dedicated /nonces (companion) or /s_nonces (server) file. On dirty reset (power-on, watchdog, brownout), nonces are bumped by NONCE_BOOT_BUMP (100) to cover any unpersisted messages. Clean wakes (deep sleep, software restart) load nonces as-is.
- Add nonce persistence to BaseChatMesh (companion) and ClientACL (server)
- Add wasDirtyReset() helper to ArduinoHelpers.h for platform-specific reset reason detection (ESP32/NRF52)
- Add onBeforeReboot() callback to CommonCLI for pre-reboot nonce flush
- Wire nonce persistence into all firmware variants: companion radio, repeater, room server, and sensor
- Only clear dirty flag on successful file write
Summary
Adds ChaCha20-Poly1305 (AEAD-4) encryption alongside the existing AES-128-ECB + HMAC-2 scheme. Updated nodes send AEAD-4 to peers that advertise support and fall back to ECB for legacy peers. All nodes can decode both formats. Old nodes continue to work unchanged.
Nonces are persisted to flash so they survive reboots without risk of reuse.
Relates to #259.
What This Means in Practical Terms
The current encryption has a few weaknesses that this PR addresses:
Message tampering is too easy to attempt. The existing 2-byte authentication code means an attacker only needs about 65,000 guesses (2^16) to forge a valid-looking message. At LoRa speeds that's roughly 9 hours of continuous attempts. The new 4-byte tag raises this to over 4 billion guesses (2^32); at the same rate that is 65,536 times longer, or roughly 67 years.
Identical messages look identical on the air. The current block cipher (ECB mode) produces the same ciphertext for the same plaintext, which can reveal patterns — for example, you could tell when someone sends the same message twice. The new scheme produces completely different ciphertext every time, even for identical messages.
Addressing fields are now protected. Currently, only the message body is authenticated. With AEAD, the payload type and addressing hashes (which identify sender and recipient) are included in the authentication check, so an attacker cannot swap or modify them without detection. Outer routing fields like TTL and hop path are intentionally left unauthenticated so repeaters can still forward packets through the mesh.
Messages get slightly smaller. ECB pads every message up to a 16-byte boundary, wasting airtime. The new scheme has no padding, so most messages shrink by a few bytes on the wire.
Nothing breaks. Updated nodes send AEAD-4 to peers that advertise support, and fall back to ECB for legacy peers. Old nodes are completely unaffected — they never receive AEAD-4 messages because the sender checks their capability first.
Nodes advertise their capabilities. Updated nodes include a flag in their advertisements saying "I understand the new encryption." When two updated nodes discover each other, they automatically start using AEAD-4 for their communication.
Nonces survive reboots. Per-peer nonce counters are saved to flash periodically and before clean reboots. After a dirty reset (power loss, watchdog, brownout), nonces are bumped forward by a safety margin to guarantee no reuse.
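The periodic-save and boot-bump behaviour can be sketched as below. The two constants are the values given in the PR description; NonceCounter, take, markSaved, and nonceAfterBoot are hypothetical names, and the sketch ignores 16-bit wraparound and the actual flash I/O.

```cpp
#include <cstdint>

constexpr uint16_t NONCE_PERSIST_INTERVAL = 50;  // value from the PR description
constexpr uint16_t NONCE_BOOT_BUMP = 100;        // value from the PR description

// Hypothetical per-peer counter kept in RAM.
struct NonceCounter {
  uint16_t next;     // next nonce value to hand out
  uint16_t unsaved;  // nonces handed out since the last flash write

  // Returns the nonce for the next message and reports whether a save is due.
  uint16_t take(bool& needs_save) {
    const uint16_t n = next++;
    needs_save = (++unsaved >= NONCE_PERSIST_INTERVAL);
    return n;
  }

  // Call only after the nonce file was written successfully.
  void markSaved() { unsaved = 0; }
};

// Starting value after a reboot: a dirty reset skips ahead past any
// nonces that may have been used but never persisted.
inline uint16_t nonceAfterBoot(uint16_t persisted, bool dirty_reset) {
  return dirty_reset ? static_cast<uint16_t>(persisted + NONCE_BOOT_BUMP) : persisted;
}
```

For example, if 1000 was the last persisted value and 49 further nonces (1000 through 1048) were used before a power loss, the post-boot value 1100 lies beyond every nonce already sent. A clean reboot reuses the persisted value directly, which is why it relies on the pre-reboot flush.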
Wire Format
Current ECB:
New AEAD-4 (same position in payload):
Average overhead: ~6 bytes (AEAD) vs ~9.5 bytes (ECB). Most messages get smaller.
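The overhead figures can be reproduced with a short calculation. The sketch assumes that ECB carries a 2-byte MAC and zero-pads the plaintext up to the next 16-byte boundary (no extra block when already aligned), and that AEAD-4 carries a 2-byte nonce and a 4-byte tag with no padding. The function names are hypothetical.

```cpp
#include <cstddef>

// Sizes assumed from the PR description.
constexpr std::size_t ECB_BLOCK = 16;
constexpr std::size_t ECB_MAC = 2;
constexpr std::size_t AEAD_NONCE = 2;
constexpr std::size_t AEAD_TAG = 4;

// Encrypted payload size for ECB: MAC plus plaintext rounded up to a block multiple.
constexpr std::size_t ecbWireSize(std::size_t plain_len) {
  return ECB_MAC + ((plain_len + ECB_BLOCK - 1) / ECB_BLOCK) * ECB_BLOCK;
}

// Encrypted payload size for AEAD-4: nonce plus unpadded ciphertext plus tag.
constexpr std::size_t aeadWireSize(std::size_t plain_len) {
  return AEAD_NONCE + plain_len + AEAD_TAG;
}

// Mean overhead in bytes over one full block of plaintext lengths (1..16).
template <typename F>
double meanOverhead(F wire_size) {
  double total = 0.0;
  for (std::size_t len = 1; len <= ECB_BLOCK; ++len) {
    total += static_cast<double>(wire_size(len) - len);
  }
  return total / static_cast<double>(ECB_BLOCK);
}
```

Under these assumptions a 10-byte plaintext takes 18 bytes with ECB and 16 with AEAD-4, while a plaintext that is already a multiple of 16 bytes is 4 bytes larger with AEAD-4, which is why the description says "most" rather than "all" messages shrink.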
Cryptographic Design
Per-message key derivation (eliminates nonce-reuse catastrophe):
Including dest_hash || src_hash makes keys direction-dependent: Alice→Bob and Bob→Alice derive different keys even with the same nonce value (for 255/256 peer pairs; the 1/256 where dest_hash == src_hash is a residual limitation of 1-byte hashes).
IV construction (12 bytes, from on-wire fields):
Associated data (authenticated but not encrypted):
- header || dest_hash || src_hash
- header || dest_hash
- header || channel_hash
Route type bits are masked out of the header in associated data (header & ~PH_ROUTE_MASK), since routing mode changes per hop as repeaters forward packets.
Nonce management: 16-bit counter per peer, persisted to flash. See "Nonce Persistence" section below.
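The direction dependence of the per-message key can be illustrated by looking only at the HMAC input. The sketch below builds nonce || dest_hash || src_hash as a 4-byte message; the big-endian nonce order and the name kdfInput are assumptions, and the HMAC-SHA256 step itself is omitted.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: the message fed to HMAC-SHA256(secret, ...).
// Layout assumed here: nonce (2 bytes, big-endian) || dest_hash || src_hash.
inline std::array<uint8_t, 4> kdfInput(uint16_t nonce, uint8_t dest_hash, uint8_t src_hash) {
  return {static_cast<uint8_t>(nonce >> 8), static_cast<uint8_t>(nonce & 0xFF), dest_hash, src_hash};
}
```

With 1-byte hashes A and B, kdfInput(n, B, A) for Alice→Bob differs from kdfInput(n, A, B) for Bob→Alice whenever A != B, so the two directions derive different keys from the same shared secret and nonce. When A == B the inputs coincide, which is the 1/256 residual case noted above.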
Nonce Persistence
Nonces are persisted to a dedicated file on flash (/nonces for companion radios, /s_nonces for server firmware).
Periodic saves: After every NONCE_PERSIST_INTERVAL (50) messages to a given peer, the nonce file is written. A dirty flag tracks whether any nonce has advanced since the last save.
Clean reboot: Software restarts and deep sleep wakes load the persisted nonces as-is. An onBeforeReboot() callback in CommonCLI flushes any dirty nonces before the restart.
Dirty reboot: Power-on, watchdog, and brownout resets are detected via wasDirtyReset() (platform-specific: esp_reset_reason() on ESP32, RESETREAS register on NRF52). After a dirty reset, all loaded nonces are bumped forward by NONCE_BOOT_BUMP (100), which is 2× the persist interval, so that even the worst-case unpersisted nonce is safely skipped.
Format: Simple array of {pub_key_prefix[6], nonce[2]} entries, matched to in-memory contacts/clients on load.
Security Comparison
memcmp (timing side-channel) vs. secure_compare (constant-time)
Scope
All node types (companion radio, repeater, room server, sensor) support both AEAD-4 decode and AEAD-4 send for peer messages.
Group Message Considerations
Group channels share a single key among all members. With a 2-byte nonce and multiple senders, cross-sender nonce collisions follow the birthday bound (~300 messages for 50% probability on an active channel). A collision leaks P1 ⊕ P2 for that specific message pair via crib-dragging, but:
This is mainly beneficial for public/hashtag channels where the PSK is already widely known and the ECB pattern leakage and weak MAC are a greater concern than the bounded nonce collision risk.
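The birthday-bound estimate for the group case can be checked numerically. The sketch assumes nonces on a channel behave like independent uniform draws from the 65,536 possible 16-bit values, which is the simplification behind the estimate rather than a model of the actual per-sender counters. The function names are hypothetical.

```cpp
// Probability that at least two of n values drawn uniformly at random
// from `space` possibilities coincide (exact product form).
inline double collisionProbability(unsigned n, double space) {
  double p_unique = 1.0;
  for (unsigned i = 1; i < n; ++i) {
    p_unique *= 1.0 - static_cast<double>(i) / space;
  }
  return 1.0 - p_unique;
}

// Smallest number of messages at which the collision probability reaches `target`.
inline unsigned messagesForProbability(double target, double space) {
  unsigned n = 1;
  while (collisionProbability(n, space) < target) ++n;
  return n;
}
```

Calling messagesForProbability(0.5, 65536.0) gives a value close to 300, in line with the figure quoted above.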
Potential future mitigations explored and deferred:
- HMAC(channel_secret, sender_pub_key)): eliminates cross-sender collisions but requires receivers to know all senders' public keys, changing the group security model from "know the PSK = full access" to "know the PSK + sender discovery = access." Ruled out as a usability regression.
Decode Order
Adaptive per-peer: for peers with CONTACT_FLAG_AEAD set, try AEAD-4 first, then fall back to ECB. For unknown/legacy peers, try ECB first, then fall back to AEAD-4. This avoids the 1/65536 ECB false-positive rate on AEAD packets (nonce bytes matching truncated HMAC) for known AEAD peers, while minimizing wasted CPU for legacy peers.
Capability Advertisement
- feat1 bit 0 (FEAT1_AEAD_SUPPORT) is set in adverts for all node types (chat, repeater, room, sensor)
- ContactInfo.flags bit 1 (CONTACT_FLAG_AEAD)
- feat1 but ignore the value (forward-compatible via existing AdvertDataParser)
Files Changed
- src/MeshCore.h: AEAD constants (AEAD_TAG_SIZE, AEAD_NONCE_SIZE, CONTACT_FLAG_AEAD, FEAT1_AEAD_SUPPORT, NONCE_PERSIST_INTERVAL, NONCE_BOOT_BUMP)
- src/Utils.h / src/Utils.cpp: aeadEncrypt() and aeadDecrypt() using ChaChaPoly
- src/Mesh.h: getPeerFlags(), getPeerNextAeadNonce() virtuals; aead_nonce param on createDatagram/createPathReturn
- src/Mesh.cpp: AEAD send path in createDatagram/createPathReturn; adaptive try-both decode order per peer
- src/helpers/ContactInfo.h: uint16_t aead_nonce field, nextAeadNonce() helper
- src/helpers/BaseChatMesh.h / BaseChatMesh.cpp: Advertise AEAD, track peer capability, AEAD send for all peer message types, nonce persistence (dirty tracking, periodic save, load with boot bump)
- src/helpers/ClientACL.h / ClientACL.cpp: Server-side AEAD nonce tracking and persistence for repeater/room/sensor clients
- src/helpers/CommonCLI.h / CommonCLI.cpp: Advertise AEAD for repeaters/rooms/sensors; onBeforeReboot() callback for nonce flush
- src/helpers/ArduinoHelpers.h: wasDirtyReset() helper (ESP32/NRF52 reset reason detection)
- examples/companion_radio/DataStore.h / DataStore.cpp: Nonce file I/O for companion radio
- examples/companion_radio/MyMesh.h / MyMesh.cpp: Wire up nonce persistence and reboot callback
- examples/simple_repeater/MyMesh.h / MyMesh.cpp: AEAD send support and nonce persistence
- examples/simple_room_server/MyMesh.h / MyMesh.cpp: AEAD send support and nonce persistence
- examples/simple_sensor/SensorMesh.h / SensorMesh.cpp: AEAD send support and nonce persistence
Build Verification
- Heltec_v3_companion_radio_ble): builds successfully
- Xiao_nrf52_companion_radio_ble): builds successfully
Future Work